Skip to content

Conversation

gmarouli
Copy link
Contributor

@gmarouli gmarouli commented Oct 2, 2025

In this PR we move the force-merge operation from the downsampling request to the ILM action.

Our goal is to decouple the downsampling operation from the force-merge operation. With this change the downsampling request is responsible to ensure that the downsampled index is refreshed and flushed but not to force merge it.

We believe that most of the time this is not necessary, and executing the force-merge operation unnecessarily can increase the load on the cluster.

To preserve backwards compatibility we move the responsibility to execute the existing force merge to the downsample ILM action and we make it configurable. By default, it will run but a user can disable it just as they can with a searchable snapshot.

"downsample": {
  "fixed_interval": "1h",
  "force_merge_index": false
}

Update

As a follow up of this PR, we pose the question is the force merge in the downsample action intentional and useful?

To answer this question, we extend time series telemetry. We define that the force merge step in the downsample ILM action is useful, if this is the only force merge step operation before a searchable snapshot.

Effectively, by this definition, we argue that the force merge in downsampling is not an intentional operation the user has requested but only the result of the implementation. We identify the biggest impact of removing it to be a searchable snapshot, but if the searchable snapshot performs its own force merge (and more performant force merge #133954) then we could skip this operation in the downsample action altogether.

Fixes: #135618

Copy link
Contributor

github-actions bot commented Oct 2, 2025

🔍 Preview links for changed docs

Copy link
Contributor

github-actions bot commented Oct 2, 2025

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

Expand for a quick overview

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

@gmarouli gmarouli added :Data Management/ILM+SLM Index and Snapshot lifecycle management :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data labels Oct 2, 2025
@gmarouli
Copy link
Contributor Author

gmarouli commented Oct 2, 2025

As it says in the description, I have one more idea for a test but I thought of it after I requested your review. So, no rush tomorrow the final test should be there too.

This idea won't work, because we cannot influence the segments of the downsampled index.

@gmarouli gmarouli marked this pull request as ready for review October 2, 2025 15:11
@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team Team:StorageEngine labels Oct 2, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Contributor

@nielsbauman nielsbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, just left a few small comments :)

private boolean isForceMergeEnabled(Map<String, Object> sourceIndexMappings) {
if (sourceIndexMappings.containsKey("_meta")) {
if (sourceIndexMappings.get("_meta") instanceof Map<?, ?> metadataMap) {
var enabledForceMergeValue = metadataMap.get("downsample.forcemerge.enabled");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double-checking: do we have any documentation on this "setting"? It's in a weird place so I doubt we care much about it, but just wanted to double-check that we're ok with not supporting this configuration any more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setting is not documented on purpose. We added it to allow users that cannot upgrade yet from 8.19, to test the impact of skipping the force-merge request. It's just a workaround since the proper way, which we implement with this PR, cannot be backported.

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM also in general. I have the same question as Niels about not looking at the metadata any more, is that going to be considered a breaking change, since a user that has that configured as false would start having their downsampled indices force merged after this change?

@gmarouli
Copy link
Contributor Author

gmarouli commented Oct 3, 2025

LGTM also in general. I have the same question as Niels about not looking at the metadata any more, is that going to be considered a breaking change, since a user that has that configured as false would start having their downsampled indices force merged after this change?

This is an undocumented workaround to allow users in 8.19.x to experiment with disabling force merge and see the impact. Considering this, I think this change is reasonable. When they upgrade they can update their policies to skip the force-merge the right "way".

@elasticsearchmachine
Copy link
Collaborator

Hi @gmarouli, I've created a changelog YAML for you.

@gmarouli
Copy link
Contributor Author

gmarouli commented Oct 8, 2025

@dakrone and/or @nielsbauman can you take another look?

@gmarouli gmarouli self-assigned this Oct 8, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @gmarouli, I've updated the changelog YAML for you.

gmarouli added a commit to elastic/docs-content that referenced this pull request Oct 9, 2025
In this PR, we propose to add practical tips for downsampling. For now
this includes, a guideline on how to choose the downsampling interval.
And then specifically for ILM, an explanation on how downsampling
relates with tiers. After
elastic/elasticsearch#135834, we should also add
here the option to disable force merge.

---------

Co-authored-by: Kostas Krikellas <[email protected]>
Co-authored-by: Marci W <[email protected]>
Copy link
Contributor

@nielsbauman nielsbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The only recommendation I have left is to run the test classes you changed a few thousand iterations, to ensure there's no combination of random values that results in a test failure.

Apologies for the late review, and thanks a lot for working and iterating on this!

Copy link
Contributor

@nielsbauman nielsbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one small typo suggestion, other than that LGTM!

Co-authored-by: Niels Bauman <[email protected]>
@gmarouli
Copy link
Contributor Author

Thank you for the review @nielsbauman , no worries about the delay, there were many things in progress so I wasn't blocked!

@gmarouli gmarouli added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Oct 14, 2025
@elasticsearchmachine elasticsearchmachine merged commit b0ef2b4 into elastic:main Oct 14, 2025
34 checks passed
@gmarouli gmarouli deleted the downsampling/move-force-merge-to-ilm branch October 14, 2025 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) :Data Management/ILM+SLM Index and Snapshot lifecycle management >enhancement :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data Team:Data Management Meta label for data/management team Team:StorageEngine v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Downsampling] Move force merge out of the downsampling API and into the downsample ILM action.

4 participants